** Data reading and variable selection from raw data
** Korean General Social Survey 2006


** 01. Reading data **

cap log close
clear all
set more off
cd /*insert your working directory here*/
use /*insert the path to your data file here*/
numlabel, add


** 02. Constructing year and country variables **

ge year=2006
lab var year "survey year"

ge country=410
lab var country "ISO country code"
//South Korea: 410 (ISO 3166-1 numeric country code)


** 03. ID variables **

ge pid=RESPID
lab var pid "person id"

ge hid=HHDNO
lab var hid "household id"
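
//Optional check, a sketch: pid is expected to identify each respondent uniquely.
//capture noisily reports a violation without stopping the do-file.
cap noi isid pid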


** 04. Basic Demographics (Sex and Age/birth year) **

ge sex=SEX
lab var sex "sex"
lab def sex 1 "male" 2 "female"
lab val sex sex

ge age=AGE
lab var age "age"

ge birthyr=year-age
lab var birthyr "year of birth"


** 05. Siblings **

ge nbro=SIBNO1+SIBNO4
ge nsis=SIBNO2+SIBNO5

ge nsibs=nbro+nsis

ge birthorder=SIBNO1+SIBNO2+1

lab var nbro "number of brothers"
lab var nsis "number of sisters"
lab var nsibs "number of siblings"
lab var birthorder "birth order"

lab def nbro 176 "don't know/no answer"
lab val nbro nbro

lab def nsis 176 "don't know/no answer"
lab val nsis nsis

lab def nsibs 352 "don't know/no answer"
lab val nsibs nsibs

lab def birthorder 177 "don't know/no answer"
lab val birthorder birthorder
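
//Optional check, a sketch under an assumption: the summed codes above (176, 352, 177)
//imply that each SIBNO item uses 88 for "don't know/no answer". If so, a respondent
//with only one of the two components missing gets a sum such as 88+2=90, which the
//labels above do not cover. The counts below show how many such cases exist.
count if (SIBNO1==88)!=(SIBNO4==88)
count if (SIBNO2==88)!=(SIBNO5==88)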


** 06. Own education **

//named educ so that the tab1, keep and rename commands below find it by its full name
rename EDUC educ
lab var educ "highest level of schooling attended"

rename DROPYR dropyr
lab var dropyr "grade at which the respondent dropped out of school"


** 07. Parents' education: Father and/or Mother **

rename PAEDUC faeduc
lab var faeduc "father's highest level of schooling attended"

rename MAEDUC moeduc
lab var moeduc "mother's highest level of schooling attended"


** 08. Own occupation **

rename OCC occ
lab var occ "respondent's occupation in ISCO-88 four-digit code"


** 09. Parents' occupation **

rename RPAEMP faempstat
lab var faempstat "father's employment status"

rename RMAEMP moempstat
lab var moempstat "mother's employment status"


** 10. Tabulate the Identified Variables **

log using /*insert your log file path here*/, replace text

** Data reading and variable selection from raw data
** Korean General Social Survey 2006

** Sex **
tab sex

** Age, Birth Year **
sum age birthyr, d

** Siblings **
sum nsibs nbro nsis birthorder, d

** R's Own Education **
tab1 educ 

** Parental Education **
tab1 faeduc moeduc 

** R's Own Occupation **
tab occ 

** Parental Occupation **
tab1 faempstat moempstat 

log close

** 11. Keep the identified variables only **

keep year country pid sex age birthyr ///
	 nbro nsis nsibs birthorder ///
	 educ faeduc moeduc ///
	 occ faempstat moempstat 


** 12. Save the Data File **

saveold /*insert your output file path here*/, replace



** 13. Homogenising education **
** Own Education **
rename educ educ_cat

ge educ_yrs=0 if educ_cat==0
replace educ_yrs=6 if educ_cat==1
replace educ_yrs=9 if educ_cat==2
replace educ_yrs=12 if educ_cat==3
replace educ_yrs=14 if educ_cat==4
replace educ_yrs=16 if educ_cat==5
replace educ_yrs=18 if educ_cat==6
replace educ_yrs=20 if educ_cat==7
replace educ_yrs=. if educ_cat==8
lab var educ_yrs "respondent's highest education in years"

ge educ_ISCED=020 if educ_cat==0
replace educ_ISCED=100 if educ_cat==1
replace educ_ISCED=244 if educ_cat==2
replace educ_ISCED=344 if educ_cat==3
replace educ_ISCED=550 if educ_cat==4
replace educ_ISCED=640 if educ_cat==5
replace educ_ISCED=750 if educ_cat==6
replace educ_ISCED=850 if educ_cat==7
replace educ_ISCED=. if educ_cat==8
lab var educ_ISCED "respondent's highest education in ISCED code"


** Parents Education **
//faeduc_flag=1: father's education refers to the respondent's actual father
ge faeduc_flag=1 

rename faeduc faeduc_cat
rename moeduc maeduc_cat

ge faeduc_yrs=0 if faeduc_cat==0
replace faeduc_yrs=6 if faeduc_cat==1
replace faeduc_yrs=9 if faeduc_cat==2
replace faeduc_yrs=12 if faeduc_cat==3
replace faeduc_yrs=14 if faeduc_cat==4
replace faeduc_yrs=16 if faeduc_cat==5
replace faeduc_yrs=18 if faeduc_cat==6
replace faeduc_yrs=20 if faeduc_cat==7
replace faeduc_yrs=. if faeduc_cat==8 | faeduc_cat==88 
lab var faeduc_yrs "father's education in years"

ge maeduc_yrs=0 if maeduc_cat==0
replace maeduc_yrs=6 if maeduc_cat==1
replace maeduc_yrs=9 if maeduc_cat==2
replace maeduc_yrs=12 if maeduc_cat==3
replace maeduc_yrs=14 if maeduc_cat==4
replace maeduc_yrs=16 if maeduc_cat==5
replace maeduc_yrs=18 if maeduc_cat==6
replace maeduc_yrs=20 if maeduc_cat==7
replace maeduc_yrs=. if maeduc_cat==8 | maeduc_cat==88 
lab var maeduc_yrs "mother's education in years"
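
//Optional check, a sketch: cross-tabulate each category variable against its
//years-of-schooling recode to confirm that the mapping above behaved as intended
//and that only the unmapped codes ended up missing.
tab educ_cat educ_yrs, missing
tab faeduc_cat faeduc_yrs, missing
tab maeduc_cat maeduc_yrs, missing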

ge faeduc_ISCED=020 if faeduc_cat==0
replace faeduc_ISCED=100 if faeduc_cat==1
replace faeduc_ISCED=244 if faeduc_cat==2
replace faeduc_ISCED=344 if faeduc_cat==3
replace faeduc_ISCED=550 if faeduc_cat==4
replace faeduc_ISCED=640 if faeduc_cat==5
replace faeduc_ISCED=750 if faeduc_cat==6
replace faeduc_ISCED=850 if faeduc_cat==7
replace faeduc_ISCED=. if faeduc_cat==8
lab var faeduc_ISCED "father's highest education in ISCED code"

ge maeduc_ISCED=020 if maeduc_cat==0
replace maeduc_ISCED=100 if maeduc_cat==1
replace maeduc_ISCED=244 if maeduc_cat==2
replace maeduc_ISCED=344 if maeduc_cat==3
replace maeduc_ISCED=550 if maeduc_cat==4
replace maeduc_ISCED=640 if maeduc_cat==5
replace maeduc_ISCED=750 if maeduc_cat==6
replace maeduc_ISCED=850 if maeduc_cat==7
replace maeduc_ISCED=. if maeduc_cat==8
lab var maeduc_ISCED "mother's highest education in ISCED code"


** 14. Homogenising siblings **
//cutoff flags: 99 means no cutoff (top-coding) was applied
ge nbro_flag=99
lab var nbro_flag "cutoff of number of brothers"
ge nsis_flag=99
lab var nsis_flag "cutoff of number of sisters"
ge nsibs_flag=99
lab var nsibs_flag "cutoff of total number of siblings"

lab def nsib_flag 99 "no cutoff"
lab val nbro_flag nsis_flag nsibs_flag nsib_flag

//recode the summed "don't know/no answer" codes to missing
replace nbro=. if nbro==176
replace nsis=. if nsis==176
replace nsibs=. if nsibs==352
replace birthorder=. if birthorder==177

** 15. Tab Education and Sibling Variables **
tab1 sex age birthyr
tab1 educ_cat educ_yrs faeduc_cat faeduc_yrs maeduc_cat maeduc_yrs faeduc_flag 
tab1 nbro nsis nsibs nbro_flag nsis_flag nsibs_flag


** 16. Save the Data File **

saveold /*insert your output file path here*/, replace
